In information theory, the conditional entropy (or equivocation) quantifies the remaining entropy (i.e. uncertainty) of a random variable $Y$ given that the value of another random variable $X$ is known. It is referred to as the entropy of $Y$ conditional on $X$, and is written $H(Y\mid X)$. Like other entropies, the conditional entropy is measured in bits, nats, or bans.
More precisely, if $H(Y\mid X=x)$ is the entropy of the variable $Y$ conditional on the variable $X$ taking a certain value $x$, then $H(Y\mid X)$ is the result of averaging $H(Y\mid X=x)$ over all possible values $x$ that $X$ may take.
Given a discrete random variable $X$ with support $\mathcal{X}$ and $Y$ with support $\mathcal{Y}$, the conditional entropy of $Y$ given $X$ is defined as:

$$
\begin{aligned}
H(Y\mid X) &\equiv \sum_{x\in\mathcal{X}} p(x)\,H(Y\mid X=x) \\
&= -\sum_{x\in\mathcal{X}} p(x) \sum_{y\in\mathcal{Y}} p(y\mid x)\,\log p(y\mid x) \\
&= -\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}} p(x,y)\,\log p(y\mid x) \\
&= \sum_{x\in\mathcal{X},\,y\in\mathcal{Y}} p(x,y)\,\log\frac{p(x)}{p(x,y)}.
\end{aligned}
$$
The last formula above expresses the conditional entropy as the negative of a Kullback-Leibler divergence, also known as relative entropy, of the joint distribution $p(x,y)$ relative to the marginal $p(x)$ viewed as a measure over pairs $(x,y)$. The conditional entropy is always non-negative, and it vanishes if and only if $p(x,y) = p(x)$ wherever $p(x,y) > 0$; this is exactly the case in which knowing the value of $X$ tells us everything about $Y$.
Note: The supports of $X$ and $Y$ can be replaced by their domains if it is understood that $0 \log 0$ should be treated as being equal to zero.
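For readers who want to compute this, here is a minimal Python sketch of the definition; the joint distribution `p_xy` and the helper functions are made up for illustration and are not part of the article. It evaluates $H(Y\mid X)$ both by averaging $H(Y\mid X=x)$ over $x$ and directly from the last formula above, and the two results agree.

```python
import math

# Hypothetical joint distribution p(x, y) over X in {0, 1} and Y in {0, 1};
# the numbers are chosen only to illustrate the formulas.
p_xy = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.2, (1, 1): 0.3,
}

def marginal_x(p_xy):
    """Marginal p(x) = sum_y p(x, y)."""
    p_x = {}
    for (x, _), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return p_x

def cond_entropy_direct(p_xy):
    """H(Y|X) = sum_{x,y} p(x,y) log2( p(x) / p(x,y) ), skipping zero terms."""
    p_x = marginal_x(p_xy)
    return sum(p * math.log2(p_x[x] / p) for (x, _), p in p_xy.items() if p > 0)

def cond_entropy_by_averaging(p_xy):
    """H(Y|X) = sum_x p(x) * H(Y | X = x)."""
    p_x = marginal_x(p_xy)
    total = 0.0
    for x, px in p_x.items():
        # entropy of the conditional distribution p(y | X = x)
        h_given_x = -sum(
            (p / px) * math.log2(p / px)
            for (x2, _), p in p_xy.items()
            if x2 == x and p > 0
        )
        total += px * h_given_x
    return total

print(cond_entropy_direct(p_xy))        # ≈ 0.8464 bits
print(cond_entropy_by_averaging(p_xy))  # same value
```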
From this definition and the definition of conditional probability, the chain rule for conditional entropy is

$$
H(Y\mid X) = H(X,Y) - H(X).
$$
This is true because

$$
\begin{aligned}
H(Y\mid X) &= \sum_{x\in\mathcal{X},\,y\in\mathcal{Y}} p(x,y)\,\log\frac{p(x)}{p(x,y)} \\
&= \sum_{x\in\mathcal{X},\,y\in\mathcal{Y}} p(x,y)\bigl(\log p(x) - \log p(x,y)\bigr) \\
&= -\sum_{x\in\mathcal{X},\,y\in\mathcal{Y}} p(x,y)\,\log p(x,y) + \sum_{x\in\mathcal{X}} p(x)\,\log p(x) \\
&= H(X,Y) - H(X).
\end{aligned}
$$
Intuitively, the combined system contains $H(X,Y)$ bits of information: we need $H(X,Y)$ bits of information to reconstruct its exact state. If we learn the value of $X$, we have gained $H(X)$ bits of information, and the system has $H(Y\mid X)$ bits of uncertainty remaining.
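The chain rule can be checked numerically. The short sketch below uses a made-up joint distribution (not from the article) and compares $H(Y\mid X)$ computed from its definition with $H(X,Y) - H(X)$.

```python
import math

# Hypothetical joint distribution, used only to check the chain rule numerically.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def entropy(pmf):
    """Shannon entropy in bits of a probability mass function given as a dict."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Marginal p(x) and the conditional entropy H(Y|X) from its definition.
p_x = {}
for (x, _), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
h_y_given_x = sum(p * math.log2(p_x[x] / p) for (x, _), p in p_xy.items() if p > 0)

# Chain rule: H(Y|X) = H(X,Y) - H(X).
print(h_y_given_x)                   # ≈ 0.8464 bits
print(entropy(p_xy) - entropy(p_x))  # same value
```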
$H(Y\mid X) = 0$ if and only if the value of $Y$ is completely determined by the value of $X$. Conversely, $H(Y\mid X) = H(Y)$ if and only if $Y$ and $X$ are independent random variables.
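Both extreme cases are easy to exhibit. In the hypothetical sketch below (distributions invented for illustration), one joint distribution makes $Y$ a deterministic function of $X$, giving $H(Y\mid X)=0$, and another makes $X$ and $Y$ independent, giving $H(Y\mid X)=H(Y)$.

```python
import math

def cond_entropy(p_xy):
    """H(Y|X) in bits from a joint pmf given as {(x, y): probability}."""
    p_x = {}
    for (x, _), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return sum(p * math.log2(p_x[x] / p) for (x, _), p in p_xy.items() if p > 0)

# Y completely determined by X (here Y = X): H(Y|X) is zero.
determined = {(0, 0): 0.5, (1, 1): 0.5}
print(cond_entropy(determined))   # 0.0

# X and Y independent fair coins: H(Y|X) equals H(Y) = 1 bit.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(cond_entropy(independent))  # 1.0
```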
In quantum information theory, the conditional entropy is generalized to the conditional quantum entropy.
For any $X$ and $Y$:
$H(X,Y) = H(X\mid Y) + H(Y\mid X) + I(X;Y)$, where $I(X;Y)$ is the mutual information between $X$ and $Y$.
$I(X;Y) \le H(X)$, where $I(X;Y)$ is the mutual information between $X$ and $Y$.
For independent $X$ and $Y$:
$H(Y\mid X) = H(Y)$ and $H(X\mid Y) = H(X)$
Although the specific conditional entropy $H(X\mid Y=y)$ can be either less than or greater than $H(X)$, $H(X\mid Y=y)$ can never exceed $H(X)$ when $X$ is the uniform distribution.
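These properties can be verified numerically. The sketch below, using another made-up joint distribution (none of the numbers come from the article), checks $H(X,Y)=H(X\mid Y)+H(Y\mid X)+I(X;Y)$ and $I(X;Y)\le H(X)$, and exhibits a value $y$ for which the specific conditional entropy $H(X\mid Y=y)$ exceeds $H(X)$ because this $X$ is not uniform.

```python
import math

def H(pmf):
    """Shannon entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Hypothetical joint distribution p(x, y); X is deliberately non-uniform.
p_xy = {(0, 0): 0.45, (1, 0): 0.05, (0, 1): 0.25, (1, 1): 0.25}
p_x = {x: sum(p for (a, _), p in p_xy.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (_, b), p in p_xy.items() if b == y) for y in (0, 1)}

# Conditional entropies directly from the definition, and the mutual information.
h_y_given_x = sum(p * math.log2(p_x[x] / p) for (x, y), p in p_xy.items() if p > 0)
h_x_given_y = sum(p * math.log2(p_y[y] / p) for (x, y), p in p_xy.items() if p > 0)
mi = H(p_x) + H(p_y) - H(p_xy)  # I(X;Y)

# H(X,Y) = H(X|Y) + H(Y|X) + I(X;Y)
print(round(H(p_xy), 6), round(h_x_given_y + h_y_given_x + mi, 6))
# I(X;Y) <= H(X)
print(mi <= H(p_x))  # True

# The specific conditional entropy H(X|Y=1) exceeds H(X) here (X is not uniform),
p_x_given_y1 = {x: p_xy[(x, 1)] / p_y[1] for x in (0, 1)}
print(H(p_x_given_y1), H(p_x))  # 1.0 vs. ~0.8813
# ...but the averaged conditional entropy H(X|Y) does not exceed H(X).
print(h_x_given_y <= H(p_x))  # True
```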